AITopics

Country: Asia > China (0.14)

Genre: Research Report > New Finding (0.34)

Industry:

Energy > Renewable > Geothermal > Geothermal Energy Exploration and Development > Geophysical Analysis & Survey (0.74)
Leisure & Entertainment > Sports (0.67)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.68)

Neural Information Processing Systems | Feb-11-2026, 09:25:49 GMT

State Aggregation Learning from Markov Transition Data

Yaqi Duan, Tracy Ke, Mengdi Wang

In this paper, we propose a tractable algorithm that estimates the probabilistic aggregation map from the system's trajectory.

artificial intelligence, disaggregation distribution, machine learning, (15 more...)

Country:

North America > United States > Massachusetts > Middlesex County > Belmont (0.05)
North America > Canada > British Columbia > Metro Vancouver Regional District > Vancouver (0.04)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.69)

Neural Information Processing Systems | Feb-10-2026, 14:24:27 GMT

89ef9ce35c7833cba14bb2381ead6c54-Paper-Conference.pdf

geoscience and remote sensing, ieee transaction, remote sensing, (11 more...)

Country:

North America > United States (0.14)
Asia > South Korea > Seoul > Seoul (0.04)
Asia > Singapore (0.04)
Asia > China > Anhui Province > Hefei (0.04)

Genre: Research Report (0.93)

Industry: Energy (0.33)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Data Science (1.00)
Information Technology > Artificial Intelligence > Vision (1.00)
(2 more...)

Neural Information Processing Systems | Feb-8-2026, 13:44:04 GMT

1be3843e534ee06d3a70c7f62b983b31-Paper-Datasets_and_Benchmarks.pdf

artificial intelligence, machine learning, natural language, (17 more...)

Country:

Europe > Germany > Brandenburg > Potsdam (0.05)
Asia > China > Hubei Province > Wuhan (0.04)
Oceania > Australia > New South Wales > Sydney (0.04)
(2 more...)

Genre: Research Report > New Finding (0.68)

Industry: Leisure & Entertainment > Sports (0.67)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Natural Language (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)

arXiv.org Artificial Intelligence | Dec-5-2025

Dual-Stream Spectral Decoupling Distillation for Remote Sensing Object Detection

Gao, Xiangyi; Zhao, Danpei; Yuan, Bo; Li, Wentao

Knowledge distillation is an effective and hardware-friendly method that plays a key role in making remote sensing object detectors lightweight. However, existing distillation methods often encounter the issue of mixed features in remote sensing images (RSIs) and neglect the discrepancies caused by subtle feature variations, leading to entangled knowledge confusion. To address these challenges, we propose an architecture-agnostic distillation method named Dual-Stream Spectral Decoupling Distillation (DS2D2) for universal remote sensing object detection tasks. Specifically, DS2D2 integrates explicit and implicit distillation grounded in spectral decomposition. First, a first-order wavelet transform is applied for spectral decomposition to preserve the critical spatial characteristics of RSIs. Leveraging this spatial preservation, a Density-Independent Scale Weight (DISW) is designed to address the challenges of dense and small object detection common in RSIs. Second, we show that implicit knowledge is hidden in subtle student-teacher feature discrepancies, which significantly influence predictions when activated by detection heads. This implicit knowledge is extracted via full-frequency and high-frequency amplifiers, which map feature differences to prediction deviations. Extensive experiments on the DIOR and DOTA datasets validate the effectiveness of the proposed method. Specifically, on the DIOR dataset, DS2D2 achieves improvements of 4.2% in AP50 for RetinaNet and 3.8% in AP50 for Faster R-CNN, outperforming existing distillation approaches. The source code will be available at https://github.com/PolarAid/DS2D2.

artificial intelligence, distillation, machine learning, (15 more...)

doi: 10.1109/TGRS.2025.3600098

2512.04413

Genre: Research Report (1.00)

Industry:

Energy > Renewable > Geothermal > Geothermal Energy Exploration and Development > Geophysical Analysis & Survey (1.00)
Education (0.94)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

arXiv.org Artificial Intelligence | Nov-27-2025

SAM Guided Semantic and Motion Changed Region Mining for Remote Sensing Change Captioning

Wang, Futian; Wang, Mengqi; Wang, Xiao; Wang, Haowen; Tang, Jin

Remote sensing change captioning is an emerging and popular research task that aims to describe, in natural language, the content of interest that has changed between two remote sensing images captured at different times. Existing methods typically employ CNNs/Transformers to extract visual representations from the given images or incorporate auxiliary tasks to enhance the final results, but they suffer from weak region awareness and limited temporal alignment. To address these issues, this paper explores the use of the SAM (Segment Anything Model) foundation model to extract region-level representations and inject region-of-interest knowledge into the captioning framework. Specifically, we employ a CNN/Transformer model to extract global-level vision features, leverage the SAM foundation model to delineate semantic- and motion-level change regions, and utilize a specially constructed knowledge graph to provide information about objects of interest. These heterogeneous sources of information are then fused via cross-attention, and a Transformer decoder is used to generate the final natural language description of the observed changes. Extensive experimental results demonstrate that our method achieves state-of-the-art performance across multiple widely used benchmark datasets. The source code of this paper will be released at https://github.com/Event-AHU/SAM_ChangeCaptioning.

large language model, machine learning, natural language, (16 more...)

2511.2142

Country: Asia (0.46)

Genre: Research Report > New Finding (0.48)

Industry:

Energy > Renewable > Geothermal > Geothermal Energy Exploration and Development > Geophysical Analysis & Survey (0.88)
Health & Medicine (0.68)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Text Processing (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
(2 more...)

arXiv.org Artificial Intelligence | Nov-27-2025

SARVLM: A Vision Language Foundation Model for Semantic Understanding and Target Recognition in SAR Imagery

Ma, Qiwei; Wang, Zhiyu; Liu, Wang; Lu, Xukun; Deng, Bin; Duan, Puhong; Kang, Xudong; Li, Shutao

Synthetic Aperture Radar (SAR) is a crucial imaging modality thanks to its all-weather capability. Although recent advances in self-supervised learning and masked image modeling (MIM) have enabled SAR foundation models, these methods largely emphasize low-level visual features and often overlook multimodal alignment and zero-shot target recognition in SAR imagery. To address this, we construct SARVLM-1M, a large-scale vision-language dataset with over one million image-text pairs aggregated from existing datasets. We further propose a domain transfer training strategy to mitigate the large gap between natural and SAR imagery. Building on this, we develop SARVLM, the first vision-language foundation model (VLM) tailored to SAR, comprising SARCLIP and SARCap. SARVLM is trained with a vision-language contrastive objective under the proposed domain transfer strategy, bridging SAR imagery and textual descriptions. Extensive experiments on image-text retrieval, zero-shot classification, semantic localization, and imagery captioning demonstrate that SARVLM delivers superior feature extraction and interpretation, outperforming state-of-the-art VLMs and advancing SAR semantic understanding. Code and datasets will be released soon.

large language model, machine learning, natural language, (17 more...)

2510.22665

Genre: Research Report (0.82)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Vision > Image Understanding (0.89)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.70)

arXiv.org Artificial Intelligence | Nov-21-2025

TESSERA: Temporal Embeddings of Surface Spectra for Earth Representation and Analysis

Feng, Zhengpeng; Atzberger, Clement; Jaffer, Sadiq; Knezevic, Jovana; Sormunen, Silja; Young, Robin; Lisaius, Madeline C.; Immitzer, Markus; Jackson, Toby; Ball, James; Coomes, David A.; Madhavapeddy, Anil; Blake, Andrew; Keshav, Srinivasan

Satellite Earth-observation (EO) time series in the optical and microwave ranges of the electromagnetic spectrum are often irregular due to orbital patterns and cloud obstruction. Compositing addresses these issues but loses information with respect to vegetation phenology, which is critical for many downstream tasks. Instead, we present TESSERA, a pixel-wise foundation model for multi-modal (Sentinel-1/2) EO time series that learns robust, label-efficient embeddings. During model training, TESSERA uses Barlow Twins and sparse random temporal sampling to enforce invariance to the selection of valid observations. We employ two key regularizers: global shuffling to decorrelate spatial neighborhoods and mix-based regulation to improve invariance under extreme sparsity. We find that for diverse classification, segmentation, and regression tasks, TESSERA embeddings deliver state-of-the-art accuracy with high label efficiency, often requiring only a small task head and minimal computation. To democratize access, adhere to FAIR principles, and simplify use, we release global, annual, 10m, pixel-wise int8 embeddings together with open weights/code and lightweight adaptation heads, thus providing practical tooling for large-scale retrieval and inference at planetary scale. The model training/inference code, downstream task code, and pre-generated embeddings can be accessed at https://github.com/ucam-eo.

artificial intelligence, machine learning, natural language, (19 more...)

2506.2038

Country:

Europe (1.00)
North America > United States (0.28)

Genre: Research Report (0.81)

Industry:

Information Technology (0.46)
Energy (0.32)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Data Science (1.00)
Information Technology > Artificial Intelligence > Vision (1.00)
(4 more...)

arXiv.org Artificial Intelligence | Nov-18-2025

SOMA: Feature Gradient Enhanced Affine-Flow Matching for SAR-Optical Registration

Wang, Haodong; Zhuo, Tao; Zhang, Xiuwei; Yin, Hanlin; Wu, Wencong; Zhang, Yanning

Achieving pixel-level registration between SAR and optical images remains a challenging task due to their fundamentally different imaging mechanisms and visual characteristics. Although deep learning has achieved great success in many cross-modal tasks, its performance on SAR-Optical registration tasks is still unsatisfactory. Gradient-based information has traditionally played a crucial role in handcrafted descriptors by highlighting structural differences. However, such gradient cues have not been effectively leveraged in deep learning frameworks for SAR-Optical image matching. To address this gap, we propose SOMA, a dense registration framework that integrates structural gradient priors into deep features and refines alignment through a hybrid matching strategy. Specifically, we introduce the Feature Gradient Enhancer (FGE), which embeds multi-scale, multi-directional gradient filters into the feature space using attention and reconstruction mechanisms to boost feature distinctiveness. Furthermore, we propose the Global-Local Affine-Flow Matcher (GLAM), which combines affine transformation and flow-based refinement within a coarse-to-fine architecture to ensure both structural consistency and local accuracy. Experimental results demonstrate that SOMA significantly improves registration precision, increasing the CMR@1px by 12.29% on the SEN1-2 dataset and 18.50% on the GFGE SO dataset. In addition, SOMA exhibits strong robustness and generalizes well across diverse scenes and resolutions.

artificial intelligence, machine learning, registration, (16 more...)

2511.13168

Genre: Research Report (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

arXiv.org Artificial Intelligence | Nov-4-2025

Geospatial Foundation Models to Enable Progress on Sustainable Development Goals

Ghamisi, Pedram; Yu, Weikang; Zhang, Xiaokang; Rizaldy, Aldino; Wang, Jian; Zhou, Chufeng; Gloaguen, Richard; Camps-Valls, Gustau

Foundation Models (FMs) are large-scale, pre-trained artificial intelligence (AI) systems that have revolutionized natural language processing and computer vision, and are now advancing geospatial analysis and Earth Observation (EO). They promise improved generalization across tasks, scalability, and efficient adaptation with minimal labeled data. However, despite the rapid proliferation of geospatial FMs, their real-world utility and alignment with global sustainability goals remain underexplored. We introduce SustainFM, a comprehensive benchmarking framework grounded in the 17 Sustainable Development Goals with extremely diverse tasks ranging from asset wealth prediction to environmental hazard detection. This study provides a rigorous, interdisciplinary assessment of geospatial FMs and offers critical insights into their role in attaining sustainability goals. Our findings show: (1) While not universally superior, FMs often outperform traditional approaches across diverse tasks and datasets. (2) Evaluating FMs should go beyond accuracy to include transferability, generalization, and energy efficiency as key criteria for their responsible use. (3) FMs enable scalable, SDG-grounded solutions, offering broad utility for tackling complex sustainability challenges. Critically, we advocate for a paradigm shift from model-centric development to impact-driven deployment, and emphasize metrics such as energy efficiency, robustness to domain shifts, and ethical considerations.

artificial intelligence, machine learning, natural language, (19 more...)

2505.24528

Country:

North America > United States (1.00)
Africa (0.93)
Asia > China (0.68)

Genre: Research Report > New Finding (1.00)

Industry:

Health & Medicine > Therapeutic Area (1.00)
Health & Medicine > Consumer Health (1.00)
Energy > Renewable > Solar (1.00)
(6 more...)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
(2 more...)